Improved surname pronunciations using decision trees

Authors

Julie Ngan

Aravind Ganapathiraju

Joseph Picone

Abstract

Proper noun pronunciation generation is a particularly challenging problem in speech recognition since a large percentage of proper nouns often defy typical letter-to-sound conversion rules. In this paper, we present decision tree methods which outperform neural network techniques. Using the decision tree method, we have achieved an overall error rate of 45.5%, which is a 35% reduction over the previous techniques. Our best system is a binary decision tree that uses a context length of 3 and employs information gain ratio as the splitting rule.
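To make the splitting rule concrete, the following is a minimal sketch of how information gain ratio could score one binary question over letter contexts. It is not the authors' implementation: the function names (`context_window`, `gain_ratio`), the padding symbol, the toy words and phoneme labels, and the reading of "context length of 3" as three letters on each side of the target letter are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def context_window(word, i, k=3, pad="_"):
    """Return the letter at position i with k letters of context on each side.

    Assumption: "context length 3" means k=3 per side; the abstract does not
    say whether the context is one-sided or two-sided.
    """
    padded = pad * k + word + pad * k
    return padded[i : i + 2 * k + 1]


def gain_ratio(samples, offset, letter):
    """Gain ratio of the binary question "is context[offset] == letter?".

    samples is a list of (context, phoneme_label) pairs.
    """
    yes = [lab for ctx, lab in samples if ctx[offset] == letter]
    no = [lab for ctx, lab in samples if ctx[offset] != letter]
    if not yes or not no:
        return 0.0  # the question does not split the data at all
    n = len(samples)
    parent = entropy([lab for _, lab in samples])
    children = (len(yes) / n) * entropy(yes) + (len(no) / n) * entropy(no)
    info_gain = parent - children
    # Split information penalizes very unbalanced partitions.
    split_info = entropy(["yes"] * len(yes) + ["no"] * len(no))
    return info_gain / split_info


# Toy data (hypothetical): how is a word-initial "c" pronounced?
toy = [
    (context_window("cat", 0), "k"),
    (context_window("cab", 0), "k"),
    (context_window("cell", 0), "s"),
    (context_window("cent", 0), "s"),
]
# Offset 4 is the letter immediately to the right of the target letter.
score = gain_ratio(toy, offset=4, letter="a")
```

A tree builder would evaluate questions like this for every offset and letter at a node and keep the one with the highest score. On the toy data, asking whether the next letter is "a" separates the two pronunciations perfectly.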

Related resources

Effects of Speaking Rate and Word Frequency on Conversational Pronunciations

The set of possible pronunciations in continuous speech corpora changes dynamically with many factors. Two variables, speaking rate and word predictability, seemed to be promising candidates for integration into dynamic ASR pronunciation models; however, our initial efforts to incorporate these factors into phone-level decision tree models met with limited success. In this paper, we confirm the i...

Multi-level decision trees for static and dynamic pronunciation models

We have been focusing on improving pronunciation models for automatic transcription of television and radio news reports by modeling phone, syllable, and word pronunciation distributions with decision trees. These models were employed in two separate sets of experiments. First, decision trees facilitated selection of word pronunciations derived automatically from data for use in a standard spee...

Rescoring multiple pronunciations generated from spelled words

Building on earlier work [2], we show how a set of binary decision trees grown by means of the Gelfand-Ravishankar-Delp algorithm [8] can be trained to generate an ordered list of possible pronunciations from a spelled word. Training is carried out on a database consisting of spelled words paired with their pronunciations (in a particular language). We show how phonotactic information can be le...

Modeling pronunciation variation with context-dependent articulatory feature decision trees

We consider the problem of predicting the surface pronunciations of a word in conversational speech, using a model of pronunciation variation based on articulatory features. We build context-dependent decision trees for both phone-based and feature-based models, and compare their perplexities on conversational data from the Switchboard Transcription Project. We find that a fully-factored model,...

Flavoured acoustic model and combined spelling to sound for asymmetrical bilingual environment

The most common target of multilingual ASR aims at covering various speakers from various languages. The problem we address in this article is more specifically an asymmetrical bilingual scenario, where the same speaker may insert in his speech some foreign words using foreign pronunciations. This is a frequent situation for French as spoken in Canada, where English proper names are often spoke...

Publication date: 1998
